Toward Robust Action Retrieval in Video

Authors

  • Samy Bakheet
  • Ayoub Al-Hamadi
  • Bernd Michaelis
  • Usama Sayed
Abstract

Visual recognition and interpretation of human-induced actions and events are among the most active research areas in computer vision, pattern recognition, and image understanding [4]. In this work, we develop a framework for robustly recognizing human actions in video sequences. The contribution of the paper is twofold. First, we present a reliable neural model, the Multi-level Sigmoidal Neural Network (MSNN), as a classifier for action recognition. Second, we show how temporal shape variations can be accurately captured using both temporal self-similarities and fuzzy log-polar histograms. Neural network classifiers have several advantages over competing machine learning classifiers, including speed, ease of training, realistic generalization capability, high selectivity, and the ability to create arbitrary partitions of the feature space. However, the neural model in its standard form might have low classification accuracy and poor generalization properties, because its neurons employ a standard bi-level function that gives only two values (i.e., binary responses) [2]. To relax this restriction and allow the neurons to generate multiple responses, a functional extension of the standard sigmoidal functions is needed. This extension is termed the Multi-level Activation Function, and the model that uses it is termed the Multi-level Sigmoidal Neural Network (MSNN). It is straightforward to derive a multi-level version from a given bi-level standard sigmoidal activation function f(x) as follows
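The abstract breaks off before the authors' formula, so the definition used in the paper is not available here. As a rough illustration of the general idea only, the sketch below assumes one common construction: averaging several shifted copies of a bi-level logistic sigmoid so that the output settles on more than two plateaus. The names `multilevel_sigmoid`, `levels`, and `spacing` are hypothetical and are not taken from the paper.

```python
import math


def sigmoid(x: float) -> float:
    """Standard bi-level logistic sigmoid (saturates at 0 and 1)."""
    # Branch on the sign of x so math.exp never receives a large positive value.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def multilevel_sigmoid(x: float, levels: int = 3, spacing: float = 6.0) -> float:
    """Illustrative multi-level activation (an assumption, not the paper's formula).

    Averages (levels - 1) sigmoids whose transition points are `spacing`
    apart and centred on 0. The result rises from 0 to 1 through
    `levels` plateaus located at 0, 1/(levels-1), ..., 1.
    """
    if levels < 2:
        raise ValueError("levels must be at least 2")
    steps = levels - 1
    # Transition points placed symmetrically around the origin.
    offsets = [(k - (steps - 1) / 2.0) * spacing for k in range(steps)]
    return sum(sigmoid(x - c) for c in offsets) / steps
```

With `levels=2` the function reduces to the ordinary sigmoid. With the default `levels=3`, inputs near zero land on an intermediate plateau around 0.5, which is the kind of additional response a bi-level neuron cannot produce.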

Similar resources

Robust multiplicative video watermarking using statistical modeling

This paper presents a robust multiplicative video watermarking scheme. The video signal is segmented into cube-like 3-D blocks, and the 3-D wavelet transform is then applied to each block. The low-frequency components of the wavelet coefficients are used for data embedding to make the process robust against both malicious and unintentional attacks. Th...

Action Change Detection in Video Based on HOG

Background and Objectives: Action recognition, the process of labeling an unknown action in a query video, is a challenging problem due to event complexity, variations in imaging conditions, and intra- and inter-individual action variability. A number of solutions have been proposed to solve the action recognition problem. Many of these frameworks suppose that each video sequence includes only one ...

Retrieval Method for Video Content in Different Format Based on Spatiotemporal Features

In this paper, a robust video content retrieval method based on spatiotemporal features is proposed. To date, most video retrieval methods rely on the characteristics of video key frames. Such frame-based methods are not robust enough across different video formats. With our method, the temporal variation of visual information is represented using spatiotemporal slices. Then the DCT is used to extra...

Mining spatiotemporal video patterns towards robust action retrieval

In this paper, we present a spatiotemporal co-location video pattern mining approach with application to robust action retrieval in YouTube videos. First, we introduce an attention shift scheme to detect and partition the focused human actions from YouTube videos, which is based upon the visual saliency [13] modeling together with both the face [35] and body [32] detectors. From the segmented s...

Shot-boundary detection: unraveled and resolved?

Partitioning a video sequence into shots is the first step toward video-content analysis and content-based video browsing and retrieval. A video shot is defined as a series of interrelated consecutive frames taken contiguously by a single camera and representing a continuous action in time and space. As such, shots are considered to be the primitives for higher level content analysis, indexing,...

Reflective Teaching in the Context of a Video Club: Nurturing Professional Relationships and Building a Learner Community

The purpose of this study was to examine how four teachers used the seven processes of videotape analysis to develop an analytic approach and reflective thinking towards their teaching. The study was organized within video clubs and was used to describe the interactions among four teachers about their experiences at a language institute. Data were gathered through videotaped recordings of lesso...

Publication date: 2010